Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up wave.resource module #352

Draft
wants to merge 20 commits into
base: develop
Choose a base branch
from

Conversation

akeeste
Copy link
Contributor

@akeeste akeeste commented Sep 17, 2024

@ssolson This is a follow-up to my other wave PRs and resolves #331. Handling the various edge cases robustly in pure numpy is difficult, so I want to first resolve #331 by using DataArrays throughout the wave resource functions instead of Datasets.

Similar to Ryan's testing mentioned in #331, I found that using DataArrays/Pandas has a 1000x speed up vs Datasets for very large input data. This should restore MHKiT's speed to it's previous state. Using a pure numpy base would have an additional 5-10x speed up from DataArrays, but I think the current work with DataArrays will:

  • be sufficient for our users
  • be easier to develop with
  • be easier to handle edge cases

Before I go forward and apply this change to the rest of the wave.resource module, can you test out energy_period and frequency_moment and try to break them? With the appropriate frequency_dimension input, those functions should handle Pandas Series, Pandas DataFrames, and xarray DataArrays regardless of input shape, dimensions names, dimension order, etc.

@akeeste akeeste marked this pull request as ready for review October 2, 2024 18:34
@akeeste akeeste marked this pull request as draft October 2, 2024 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant